Use big int for integer type without max value #65

SiebrenW · 2024-01-17T12:38:50Z

This addresses the issue #54
The exi_unsigned_t type is being reused (and its array size is increased to 20) to make an exi_signed_t that's effectively a big int in this case.
For diagnostics we may want to make a tostring-like function, but I haven't really thought about how to do that yet (probably just printing each byte in hex)
This does expose the exi_unsigned_t and exi_signed_t in the other encode/decode source/header files.

I generated din, iso2 and iso20, and all of it seems to build. I haven't yet tested the encoded stream or tried to decode a known stream. I will test it if the code is fine-ish; if you want more fundamental changes, I don't think it makes sense to create a test suite for it yet.

SiebrenW · 2024-01-18T12:25:55Z

I tested it with the following code: https://gist.github.com/SirTates/e840a6cbc3831b3b8134d7d4098d2682 and it worked with my last commit for the decode (I already figured this was missing the non-strict SE, but I copy pasted the template from another jinja that was supposed to be trivial too)

I must comment on the usability of the library compared to OpenV2G: I sorely miss the flush function, which is not ideal (see my hacky if bit_count != 0 check; I know there are bugs in there). I'm also not too fond of exi_bitstream_get_length, and I only discovered it after the tests.

Regarding performance, I'm thinking of rewriting the ..write_bits function to write entire bytes when nbits exceeds 8, instead of going bit by bit. The compiler does not optimise this, so the code itself needs to be less wasteful.
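A minimal sketch of the byte-wise idea, for illustration only. The bit_writer_t struct and the write_bits name below are hypothetical stand-ins and are not the cbexigen bitstream API; the sketch assumes bits are written most significant first and moves up to 8 bits per loop step instead of one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitstream; not the library's own type. */
typedef struct {
    uint8_t *data;       /* output buffer */
    size_t size;         /* capacity of data in bytes */
    size_t byte_pos;     /* index of the byte currently being filled */
    unsigned bit_count;  /* bits already used in data[byte_pos], 0..7 */
} bit_writer_t;

/* Write the low nbits (1..32) of value, most significant bit first.
 * Each loop step copies as many bits as fit into the current byte,
 * so a byte-aligned write moves 8 bits at a time.
 * Returns 0 on success, -1 if the buffer is full. */
static int write_bits(bit_writer_t *w, unsigned nbits, uint32_t value)
{
    while (nbits > 0) {
        unsigned free_bits = 8u - w->bit_count;
        unsigned take = nbits < free_bits ? nbits : free_bits;
        uint8_t chunk;

        if (w->byte_pos >= w->size) {
            return -1;
        }
        /* the next `take` bits of value, right-aligned */
        chunk = (uint8_t)((value >> (nbits - take)) & ((1u << take) - 1u));
        if (w->bit_count == 0) {
            w->data[w->byte_pos] = 0; /* starting a fresh byte */
        }
        w->data[w->byte_pos] |= (uint8_t)(chunk << (free_bits - take));
        w->bit_count += take;
        nbits -= take;
        if (w->bit_count == 8u) {
            w->bit_count = 0;
            w->byte_pos++;
        }
    }
    return 0;
}
```

Writing 3 bits of 0x5 followed by 13 bits of 0x1ABC into an empty buffer takes three loop steps in total and leaves the bytes 0xBA 0xBC.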

TGruett · 2024-01-22T11:51:52Z

Hi,
Have you also tested it with larger SerialNumbers? I have just tried it with a 39-digit number (16 octets) and it comes out differently than I expected.
To be more precise: I converted a CertificateInstallationReq with a 39-digit SerialNumber to EXI with another encoder and decoded it with cbexigen. The value that comes out is different from the one I put in. I've encoded 16 octets but get 19 octets after decoding.

However, the implementation worked for me with smaller numbers

SiebrenW · 2024-01-23T15:09:02Z

> Hi, Have you also tested it with larger SerialNumbers? I have just tried it with a 39-digit number (16 octets) and it comes out differently than I expected. To be more precise: I converted a CertificateInstallationReq with a 39-digit SerialNumber to EXI with another encoder and decoded it with cbexigen. The value that comes out is different from the one I put in. I've encoded 16 octets but get 19 octets after decoding.
>
> However, the implementation worked for me with smaller numbers

It's chopping the original number into 7-bit chunks, with 1 bit of padding in every byte. This form can be written to or read from the EXI stream directly. Up to an int64 you can use exi_basetypes_convert_64_to_signed and exi_basetypes_convert_64_from_signed, but that's as far as native C types go. Any value of 7 bits or fewer fits in the first octet.

That does remind me that 20 bytes may not be sufficient: with 1 padding bit per octet, 20 octets lose 20 bits (2.5 bytes) to padding, so we need 23 octets to hold 20 bytes' worth of big int.
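The size arithmetic can be sketched as a small helper. The name octets_needed and the macro are made up for this illustration; the only assumption is that each octet carries 7 value bits.

```c
#include <stddef.h>

/* Each octet of the 7-bit representation carries 7 value bits. */
#define VALUE_BITS_PER_OCTET 7u

/* Number of 7-bit octets needed to hold nbytes full 8-bit bytes,
 * rounded up. Hypothetical helper, not part of the library. */
static size_t octets_needed(size_t nbytes)
{
    return (nbytes * 8u + VALUE_BITS_PER_OCTET - 1u) / VALUE_BITS_PER_OCTET;
}
```

octets_needed(20) gives 23, matching the estimate above, and octets_needed(16) gives 19, which is the octet count reported for the 16-octet serial number.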

TGruett · 2024-01-24T08:34:14Z

Ah OK, I see. Would it be possible for you to create a function that builds a 'real' byte array (uint8, 8 value bits per byte) from the exi_signed / exi_unsigned?
I want to use the decoder to display the X509SerialNumber and this would help me to print the real-world example with 16 octets at least as a hex string.

TGruett · 2024-01-25T10:37:29Z

I came across a post on Stack Overflow that appears to provide a solution to our problem: https://stackoverflow.com/questions/32670626/remove-nth-bit-from-buffer-and-shift-the-rest.

I tried to create the unpadded uint8 array using the function mentioned in the post, but it did not work correctly. It is possible that I only made a mistake with the LSB and MSB or something. I am not very familiar with the EXI encoding.

Maybe the mentioned post could help you with this issue. Otherwise I'll try to get back to it next week, if I have the time

SiebrenW · 2024-01-25T14:14:16Z

I tested it for a bit and I came to this solution:

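/* Convert the 7-bit EXI octets in exi_unsigned into plain 8-bit bytes,
 * least significant byte first. Relies on exi_unsigned_t and
 * EXI_BASETYPES_OCTET_SEQ_VALUE_MASK from the generated EXI base types header.
 * Returns 0 on success, -1 if data (capacity data_size) is too small. */
int exi_basetypes_convert_bytes_from_unsigned(const exi_unsigned_t* exi_unsigned, uint8_t* data, size_t* data_len, size_t data_size)
{
    const uint8_t* current_octet = exi_unsigned->octets;
    uint16_t temp = 0;        /* value bits collected but not yet written out */
    size_t total_offset = 0;  /* number of valid bits currently held in temp */

    *data_len = 0;

    for (size_t n = 0; n < exi_unsigned->octets_count; n++) {
        /* place the 7 value bits of this octet above the bits already in temp */
        temp = temp + ((uint16_t)(*current_octet & EXI_BASETYPES_OCTET_SEQ_VALUE_MASK) << (total_offset));
        total_offset += 7;
        if (total_offset >= 8) {
            /* a full byte is available: write it out */
            if (data_size == *data_len) {
                return -1;
            }
            total_offset -= 8;
            data[(*data_len)++] = temp & 0xFF;
            temp >>= 8;
        }
        current_octet++;
    }
    /* flush the remaining bits; this can yield an extra 0x00 as the last byte */
    if (total_offset != 0) {
        if (data_size == *data_len) {
            return -1;
        }
        data[(*data_len)++] = temp & 0xFF;
    }
    return 0;
}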

You can try to use that for now. It does happen to reverse the order of the bytes if you don't mind, but I only had a 30 minute break.
I may add this to the PR if this is somehow essential, but currently I'm wondering if it is. This is just for stringifying it, but the serialnumber is for identification and the 7-bit octets do that just fine.
Edit: I remembered that I didn't take the data_size into account. Added the check.
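Since the converted bytes come out least significant first, printing them as a hex string means walking the array backwards. A minimal sketch, assuming that byte order; the helper name bytes_le_to_hex is invented for this example and is not part of the PR.

```c
#include <stddef.h>
#include <stdint.h>

/* Render len little-endian bytes as a hex string, most significant
 * byte first. out must hold at least 2 * len + 1 characters.
 * Returns 0 on success, -1 if out is too small.
 * Hypothetical helper for display purposes only. */
static int bytes_le_to_hex(const uint8_t *data, size_t len, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t pos = 0;

    if (out_size < 2 * len + 1) {
        return -1;
    }
    /* walk from the last (most significant) byte down to the first */
    for (size_t i = len; i > 0; i--) {
        out[pos++] = digits[data[i - 1] >> 4];
        out[pos++] = digits[data[i - 1] & 0x0F];
    }
    out[pos] = '\0';
    return 0;
}
```

For the two bytes 0x2C 0x01 (the value 300 in little-endian order) this produces the string "012C".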

TGruett · 2024-01-30T11:32:40Z

Thank you very much, with this snippet I got everything to work as expected!

In my opinion it makes sense to add this code to PR. After all, this is an EXI de-/encoder and every user will expect to be able to de-/encode the EXI datatypes. Since the function already exists anyway, it is not much effort to add it.

SiebrenW · 2024-01-30T14:56:22Z

> Thank you very much, with this snippet I got everything to work as expected!
>
> In my opinion it makes sense to add this code to PR. After all, this is an EXI de-/encoder and every user will expect to be able to de-/encode the EXI datatypes. Since the function already exists anyway, it is not much effort to add it.

I would first like to write a to_unsigned before I include that, if I can find the time to make it, so users can also import larger numbers like that. Otherwise we can only test hardcoded examples and can't even terrify the EVs that most likely won't support anything over 64 bits (if that) anyways.

I thought about actually making a decimal string (basically so it's closer to XML), but briefly thinking about the logic needed I figured that would take too long. Maybe I will at some point, seems like a fun challenge.

SiebrenW · 2024-01-31T09:59:57Z

I have now added the bytes-to and bytes-from unsigned procedures. Do note that I get an extra byte when converting back to bytes, but for printing that's probably fine, and I have spent too much of my break on this already 😐.
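For readers following along, the bytes-to-octets direction can be sketched as below. This is not the code from the PR; it is an independent illustration that assumes little-endian input bytes and assumes the 7-bit octets keep a continuation flag in the top bit on every octet except the last, as the value mask in the earlier snippet suggests. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define OCTET_VALUE_MASK 0x7Fu /* low 7 bits carry the value */
#define OCTET_MORE_FLAG  0x80u /* assumed: set on every octet except the last */

/* Split little-endian bytes into 7-bit octets, least significant group first.
 * octets must have room for the worst case of ceil(8 * data_len / 7) octets
 * (at least 1). Returns 0 on success, -1 if octets_size is too small. */
static int bytes_to_octets(const uint8_t *data, size_t data_len,
                           uint8_t *octets, size_t octets_size, size_t *octets_len)
{
    size_t worst_case = (data_len * 8u + 6u) / 7u;
    uint16_t acc = 0;      /* bits not yet emitted, right-aligned */
    unsigned acc_bits = 0; /* number of valid bits in acc */
    size_t count = 0;

    if (worst_case == 0) {
        worst_case = 1;
    }
    if (octets_size < worst_case) {
        return -1;
    }
    for (size_t n = 0; n < data_len; n++) {
        acc = (uint16_t)(acc | ((uint16_t)data[n] << acc_bits));
        acc_bits += 8;
        while (acc_bits >= 7) {
            octets[count++] = (uint8_t)((acc & OCTET_VALUE_MASK) | OCTET_MORE_FLAG);
            acc >>= 7;
            acc_bits -= 7;
        }
    }
    if (acc_bits > 0 || count == 0) {
        octets[count++] = (uint8_t)((acc & OCTET_VALUE_MASK) | OCTET_MORE_FLAG);
    }
    /* drop all-zero groups at the most significant end, but keep one octet */
    while (count > 1 && (octets[count - 1] & OCTET_VALUE_MASK) == 0) {
        count--;
    }
    /* the last octet has no continuation flag */
    octets[count - 1] = (uint8_t)(octets[count - 1] & OCTET_VALUE_MASK);
    *octets_len = count;
    return 0;
}
```

With the input bytes 0x2C 0x01 (the value 300) the sketch yields the two octets 0xAC 0x02.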

barsnick · 2024-02-02T07:24:59Z

We will take a look at this solution, and merge it, if it works for "everyone".

I haven't looked at the final details of the recent force-push, but obviously prefer an API representation as a (host-endian?) byte array representation of the large integer.

Regarding extra bytes, I will write a note in your other pull request. I think there may be misunderstandings regarding the counting of stream bits and bytes in cbV2G/cbExiGen.

TGruett · 2024-02-13T11:58:42Z

I've been indisposed for the last few days and have finally got the chance to test it again. Everything works fine for me. I only get an uninitialised warning for the uint16_t dummy in the function exi_basetypes_convert_bytes_to_unsigned()

barsnick · 2024-02-13T13:07:05Z

> finally got the chance to test it again. Everything works fine for me.

Thanks for looking at it. We will review it and eventually merge it.

> I only get an uninitialised warning for the uint16_t dummy in the function exi_basetypes_convert_bytes_to_unsigned()

That needs to be fixed.

SiebrenW · 2024-02-13T14:13:21Z

I default initialised the dummy. I should add more warnings+werror in my test build script.

TGruett · 2024-04-08T05:52:30Z

Good Morning, what is the status of this PR? I'm just wondering if it's still planned to be merged. I am already using this big int implementation and it is working perfectly so far.

SebaLukas · 2024-10-10T12:17:40Z

This PR also helped us a lot today at the Testival in France.
But I still had to set EXI_BASETYPES_MAX_OCTETS_SUPPORTED to 30. The SerialNumbers of Hubject certificates are simply gigantic.

That's why I think the PR is great :) and would merge it directly if I don't find anything next week.

SebaLukas

Thank you very much again for your PR. I only found a few small things.

src/input/code_templates/c/static_code/exi_basetypes_encoder.c.jinja

src/input/code_templates/c/static_code/exi_basetypes_decoder.c.jinja

src/input/code_templates/c/static_code/exi_basetypes.h.jinja

SebaLukas · 2024-10-11T09:36:31Z

Your last commit has no Signed-off-by line. That's the reason why DCO fails.

barsnick

I like the way the code was expanded to generic integers. Thank you for the effort.

I inserted a few style remarks.

Regression testing was fine. I didn't try any EXI streams with large X509Serials yet.

But one thing I am wondering: exi_basetypes_convert_bytes_{from,to}_unsigned() seems to insert/remove the EXI 7-bit integer encoding, which I was hoping you would add.

Yet this operation is not automatically applied to the result placed into the exi_unsigned_t. IMHO, an API that requires you to shuffle a large integer "word" memory block into 7-bit slots (or vice versa) is unacceptable; it does not follow the principle of least astonishment. 😉

Can we put these calls into src/input/code_templates/c/decoder/{En,De}codeTypeSigned.jinja?

src/input/code_templates/c/static_code/exi_basetypes.c.jinja

src/input/code_templates/c/static_code/exi_basetypes.h.jinja

SebaLukas · 2024-10-18T08:26:12Z

@SiebrenW Would you still work on @barsnick points? I would like to merge this PR soon :)

SiebrenW · 2024-10-18T18:27:20Z

> @SiebrenW Would you still work on @barsnick points? I would like to merge this PR soon :)

I have been busy this week, sorry, but I can have a go this evening.

SiebrenW · 2024-10-18T19:35:02Z

I have applied most feedback. Would you check again?

regarding the copy or const pointer to the struct, I naively want to save some stack memory but I didn't check the instructions or anything. Just your typical "how much bigger than a pointer?" fingerspitzengefühl.

barsnick · 2024-10-18T23:15:43Z

The integration of my review comments looks fine so far.

> regarding the copy or const pointer to the struct, I naively want to save some stack memory but I didn't check the instructions or anything

That's probably fine as it is.

Would you accept a proposal where I integrate the 7/8-8/7 stuffing and destuffing into the API, to make it a clear "integer in little-endian octet order" array without any stuffing? I have something prepared (for Monday).

barsnick · 2024-10-21T13:01:56Z

> You can try to use that for now. It does happen to reverse the order of the bytes if you don't mind, but I only had a 30 minute break.

That's fine. The "reversal" just means it's little-endian, which is okay if it's documented.

> I may add this to the PR if this is somehow essential, but currently I'm wondering if it is. This is just for stringifying it, but the serialnumber is for identification and the 7-bit octets do that just fine.

I don't agree. If you need to encode an existing value, you want to put in the right one. On the other hand, perhaps the value isn't really used for anything. I would still prefer a straightforward type in the API, not some obscure reflection of EXI internals.

By the way, by design, the 7-to-8 decoder sometimes adds an extra 0x00 byte at the end, which is insignificant since it is the most significant byte. We can leave it for now, but a nice optimization would be to drop it (unless it's the only value in the byte array).
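The suggested optimization could look roughly like this. It is a sketch under the assumption that the byte array is little-endian, so the insignificant zero bytes sit at the end; the helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the length of data without trailing 0x00 bytes. In little-endian
 * order these are leading zeros of the number and carry no information.
 * A lone zero byte is kept so the value 0 stays representable.
 * Hypothetical helper, not part of the PR. */
static size_t trim_msb_zero_bytes(const uint8_t *data, size_t len)
{
    while (len > 1 && data[len - 1] == 0x00) {
        len--;
    }
    return len;
}
```

For the bytes 0x2C 0x01 0x00 this returns 2, and for a single 0x00 byte it returns 1.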

Let me try to comment my proposed changes in the PR. You can also look at https://github.com/EVerest/cbexigen/commits/feature/use_big_int_for_integer_fixup/, where I modeled them.

src/input/code_templates/c/static_code/exi_basetypes_decoder.c.jinja

src/input/code_templates/c/static_code/exi_basetypes_encoder.c.jinja

barsnick

I'll put the remaining changes into a separate pull request.

SiebrenW force-pushed the feature/use_big_int_for_integer branch from ca8ab5f to 286638b Compare January 17, 2024 12:40

SiebrenW mentioned this pull request Jan 19, 2024

Datatype of X509SerialNumber #54

Closed

SiebrenW force-pushed the feature/use_big_int_for_integer branch from a507b6d to d5ddbc4 Compare January 31, 2024 09:15

SiebrenW force-pushed the feature/use_big_int_for_integer branch 2 times, most recently from 03d046a to 272dee6 Compare January 31, 2024 13:34

barsnick mentioned this pull request Feb 2, 2024

Feature/improved bitstream efficiency #66

Open

SiebrenW force-pushed the feature/use_big_int_for_integer branch from 272dee6 to 29ad531 Compare February 13, 2024 14:11

SebaLukas assigned barsnick and chausGit Apr 16, 2024

SebaLukas assigned SebaLukas and unassigned barsnick and chausGit Oct 10, 2024

SebaLukas approved these changes Oct 11, 2024

SiebrenW force-pushed the feature/use_big_int_for_integer branch from 29ad531 to c0f4721 Compare October 11, 2024 09:00

SiebrenW requested a review from barsnick as a code owner October 11, 2024 09:00

SiebrenW requested a review from chausGit as a code owner October 11, 2024 09:00

Siebren Weertman added 3 commits October 11, 2024 11:01

Use big int for integer type without max value

b751dd2

Signed-off-by: Siebren Weertman <[email protected]> Signed-off-by: Siebren Weertman <[email protected]>

fix decoder for signed datatype with non-strict exi padding

243ad56

Signed-off-by: Siebren Weertman <[email protected]> Signed-off-by: Siebren Weertman <[email protected]>

add conversion for byte to/from unsigned (bigint) datatype

b76f818

Signed-off-by: Siebren Weertman <[email protected]>
const correctness and remove log
Signed-off-by: Siebren Weertman <[email protected]>
init dummy
Signed-off-by: Siebren Weertman <[email protected]>

SiebrenW force-pushed the feature/use_big_int_for_integer branch 2 times, most recently from be8d1f7 to 254153e Compare October 11, 2024 09:07

SiebrenW force-pushed the feature/use_big_int_for_integer branch from 254153e to c779a3e Compare October 11, 2024 09:38

barsnick requested changes Oct 15, 2024

increase int buffer and code cleanup

de69116

Signed-off-by: Siebren <[email protected]>

SiebrenW force-pushed the feature/use_big_int_for_integer branch from c779a3e to de69116 Compare October 18, 2024 19:28

barsnick reviewed Oct 21, 2024

src/input/code_templates/c/static_code/exi_basetypes_decoder.c.jinja

barsnick reviewed Oct 21, 2024

src/input/code_templates/c/static_code/exi_basetypes_encoder.c.jinja

SebaLukas requested a review from barsnick October 24, 2024 10:03

barsnick approved these changes Oct 24, 2024

Merge branch 'main' into feature/use_big_int_for_integer

6781184

SebaLukas merged commit d0b5310 into EVerest:main Oct 24, 2024

barsnick mentioned this pull request Oct 24, 2024

Represent big integers as arrays of bytes #87

Merged

3 tasks
